8 research outputs found

    MSM/RD: Coupling Markov state models of molecular kinetics with reaction-diffusion simulations

    Molecular dynamics (MD) simulations can model the interactions between macromolecules with high spatiotemporal resolution but at a high computational cost. By combining high-throughput MD with Markov state models (MSMs), it is now possible to obtain long-timescale behavior of small to intermediate biomolecules and complexes. To model the interactions of many molecules at large lengthscales, particle-based reaction-diffusion (RD) simulations are more suitable but lack molecular detail. Thus, coupling MSMs and RD simulations (MSM/RD) would be highly desirable, as they could efficiently produce simulations at large time- and lengthscales, while still conserving the characteristic features of the interactions observed at atomic detail. While such a coupling seems straightforward, fundamental questions are still open: Which definition of MSM states is suitable? Which protocol to merge and split RD particles in an association/dissociation reaction will conserve the correct bimolecular kinetics and thermodynamics? In this paper, we make the first step towards MSM/RD by laying out a general theory of coupling and proposing a first implementation for association/dissociation of a protein with a small ligand (A + B ⇌ C). Applications on a toy model and CO diffusion into the heme cavity of myoglobin are reported.
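
    For readers unfamiliar with the MSM ingredient mentioned above, the following is a minimal sketch of how a transition matrix might be estimated from a discretized trajectory. It is an illustration only, not the MSM/RD coupling itself: the function names, the toy trajectory, and the simple non-reversible count-based estimator are assumptions made here and do not come from the paper.

```python
import numpy as np


def estimate_transition_matrix(dtraj, n_states, lag=1):
    """Row-normalized transition counts at the given lag time.

    Hypothetical helper: a plain count-based (non-reversible) estimate.
    """
    counts = np.zeros((n_states, n_states))
    for i, j in zip(dtraj[:-lag], dtraj[lag:]):
        counts[i, j] += 1
    row_sums = counts.sum(axis=1, keepdims=True)
    row_sums[row_sums == 0] = 1.0  # leave unvisited states as zero rows
    return counts / row_sums


def stationary_distribution(T):
    """Left eigenvector of T for the largest eigenvalue, normalized to sum to 1."""
    evals, evecs = np.linalg.eig(T.T)
    idx = np.argmax(evals.real)
    pi = np.abs(evecs[:, idx].real)
    return pi / pi.sum()


# Toy discrete trajectory over three states (made up for illustration).
dtraj = np.array([0, 0, 1, 1, 0, 2, 2, 2, 1, 0, 0, 1, 2, 2, 0])
T = estimate_transition_matrix(dtraj, n_states=3)
pi = stationary_distribution(T)
print(T)
print(pi)
```

    In practice one would use many long trajectories, validate the lag time, and typically enforce reversibility; this sketch omits all of that.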

    Temperature steerable flows and Boltzmann generators

    Boltzmann generators approach the sampling problem in many-body physics by combining a normalizing flow and a statistical reweighting method to generate samples in thermodynamic equilibrium. The equilibrium distribution is usually defined by an energy function and a thermodynamic state. Here, we propose temperature steerable flows (TSFs) which are able to generate a family of probability densities parametrized by a choosable temperature parameter. TSFs can be embedded in generalized ensemble sampling frameworks to sample a physical system across multiple thermodynamic states

    Diffusion-influenced reaction rates in the presence of pair interactions

    The kinetics of bimolecular reactions in solution depends, among other factors, on intermolecular forces such as steric repulsion or electrostatic interaction. Microscopically, a pair of molecules first has to meet by diffusion before the reaction can take place. In this work, we establish an extension of Doi's volume reaction model to molecules interacting via pair potentials, which is a key ingredient for interacting-particle-based reaction-diffusion (iPRD) simulations. As a central result, we relate model parameters and macroscopic reaction rate constants in this situation. We solve the corresponding reaction-diffusion equation in the steady state and derive semi-analytical expressions for the reaction rate constant and the local concentration profiles. Our results apply to the full spectrum from well-mixed to diffusion-limited kinetics. For limiting cases, we give explicit formulas, and we provide a computationally inexpensive numerical scheme for the general case, including the intermediate, diffusion-influenced regime. The obtained rate constants decompose uniquely into encounter and formation rates, and we discuss the effect of the potential on both subprocesses, exemplified for a soft harmonic repulsion and a Lennard-Jones potential. The analysis is complemented by extensive stochastic iPRD simulations, and we find excellent agreement with the theoretical predictions.
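
    As a point of reference for the relation between model parameters and macroscopic rate constants, the sketch below evaluates the well-known steady-state rate of the Doi volume reaction model for non-interacting particles, k = 4πDR(1 − tanh(κR)/(κR)) with κ = sqrt(λ/D). This is the potential-free baseline only, not the extension to pair potentials derived in the paper; the function name and parameter values are assumptions chosen for illustration.

```python
import math


def doi_rate_constant(D, R, lam):
    """Macroscopic rate constant of the Doi model without pair interactions.

    D   : relative diffusion coefficient of the two reactants
    R   : reaction radius
    lam : microscopic reaction rate inside the reaction radius (must be > 0)
    """
    kappa_R = R * math.sqrt(lam / D)
    return 4.0 * math.pi * D * R * (1.0 - math.tanh(kappa_R) / kappa_R)


D, R = 1.0, 1.0  # arbitrary illustrative units

# Fast microscopic rate: approaches the Smoluchowski limit 4*pi*D*R.
k_fast = doi_rate_constant(D, R, lam=1e8)
# Slow microscopic rate: approaches the well-mixed limit lam * (4/3)*pi*R^3.
k_slow = doi_rate_constant(D, R, lam=1e-4)

print(k_fast, 4.0 * math.pi * D * R)
print(k_slow, 1e-4 * 4.0 / 3.0 * math.pi * R**3)
```

    The two printed pairs illustrate the diffusion-limited and well-mixed ends of the spectrum mentioned in the abstract; the intermediate regime is covered by the same expression.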

    UNICON: A unified framework for behavior-based consumer segmentation in e-commerce

    Data-driven personalization is a key practice in fashion e-commerce, improving the way businesses serve their consumers' needs with more relevant content. While hyper-personalization offers highly targeted experiences to each consumer, it requires a significant amount of private data to create an individualized journey. To alleviate this, group-based personalization provides a moderate level of personalization built on the broader common preferences of a consumer segment, while still being able to personalize the results. We introduce UNICON, a unified deep learning consumer segmentation framework that leverages rich consumer behavior data to learn long-term latent representations and uses them to extract two pivotal types of segmentation catering to various personalization use cases: lookalike, which expands a predefined target seed segment with consumers of similar behavior, and data-driven, which reveals non-obvious consumer segments with similar affinities. Through extensive experimentation, we demonstrate our framework's effectiveness in fashion at identifying lookalike Designer audiences and data-driven style segments. Furthermore, we present experiments that showcase how segment information can be incorporated into a hybrid recommender system that combines hyper- and group-based personalization to exploit the advantages of both alternatives and improve the consumer experience.

    Enhanced sampling methods for molecular systems: multiscale and data-driven techniques

    Simulations of molecular systems have led to significant discoveries in molecular biology. The high accuracy of these simulations enables us to understand biological functions on a molecular scale. In connection with experimental results, they have proved to be a powerful tool to investigate biological functions. While the applications for such simulations are countless, in practice it is only possible to simulate small systems due to computational limitations; reaching biologically relevant time- and length-scales is still beyond feasibility, even for the most powerful computers. This constraint is commonly known as the sampling problem. With the progress in hardware development slowing down, demand for new methods that enable reaching relevant scales is high. This thesis aims to provide new tools that help molecular simulations reach biologically relevant scales. It is split into two parts. The first part provides new methods for rate computations in reactive systems, such as protein-ligand binding, oligomerization, or protein-protein association. The first method combines Markov state models of molecular kinetics with particle-based reaction-diffusion (PBRD) to generate a coarse-grained simulation of interacting molecules. This method conserves, at atomistic detail, the characteristic kinetics of the interactions observed in molecular dynamics simulations of the interacting molecules in close proximity. Furthermore, a method is introduced to provide realistic parameters for PBRD simulations. In particular, it enables tuning the microscopic parameters of PBRD simulations such that experimentally obtained rates are reproduced in the dilute limit. This provides a well-defined starting point to study effects such as crowding, which are common at the cellular scale. The second part provides new methods based on Markov chain Monte Carlo. These can be used to speed up the generation of equilibrium samples from the Boltzmann distribution and thus enable faster computation of stationary observables. In biological systems, it is often observed that high barriers in the free energy landscape dramatically slow down the sampling process. To speed up computations, a whole range of methods has been developed. The latest advancements are facilitated by the recent rise of machine learning research, which provides promising new tools to approach the sampling problem from completely different angles. In this spirit, a new method is introduced that aims to directly propose transitions between regions of high population in phase space, thus jumping directly over energetic barriers. These long-range moves are proposed by a neural network trained to generate high-efficiency moves, which circumvents the slow transitions across energy barriers altogether. A second proposed method is based on the recently developed Boltzmann generators and aims to combine these with parallel tempering in order to speed up sampling significantly. To this end, a machine learning technique is employed that generates samples close to the Boltzmann distribution at different temperatures. In both of these methods, convergence to the correct distribution is ensured by enforcing detailed balance.
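
    To make the role of detailed balance in parallel tempering concrete, the following sketch shows the standard Metropolis acceptance probability for swapping configurations between two replicas at inverse temperatures β_i and β_j. This is textbook replica exchange, not the Boltzmann-generator-based scheme proposed in the thesis; the function name and the numerical values are assumptions for illustration.

```python
import math


def swap_acceptance(beta_i, beta_j, u_i, u_j):
    """Metropolis probability of exchanging two replica configurations.

    beta_i, beta_j : inverse temperatures of the two replicas
    u_i, u_j       : potential energies of the configurations currently
                     held by replica i and replica j
    """
    log_ratio = (beta_i - beta_j) * (u_i - u_j)
    return 1.0 if log_ratio >= 0.0 else math.exp(log_ratio)


# Illustrative values (arbitrary units).
beta_i, beta_j = 1.0, 0.5
u_i, u_j = 2.0, 1.0

# Joint Boltzmann weights before and after the swap.
w_before = math.exp(-beta_i * u_i - beta_j * u_j)
w_after = math.exp(-beta_i * u_j - beta_j * u_i)

# Detailed balance: weight times acceptance is equal in both directions.
forward = w_before * swap_acceptance(beta_i, beta_j, u_i, u_j)
backward = w_after * swap_acceptance(beta_i, beta_j, u_j, u_i)
print(forward, backward)
```

    The equality of the two printed fluxes is what guarantees that the joint distribution over both replicas is left invariant by the swap move.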

    Multiscale molecular kinetics by coupling Markov state models and reaction-diffusion dynamics

    A novel approach to simulate simple protein-ligand systems at large time- and length-scales is to couple Markov state models (MSMs) of molecular kinetics with particle-based reaction-diffusion (RD) simulations, MSM/RD. Currently, MSM/RD lacks a mathematical framework to derive coupling schemes, is limited to isotropic ligands in a single conformational state, and lacks a multiparticle extension. In this work, we address these needs by developing a general MSM/RD framework by coarse-graining molecular dynamics into hybrid switching diffusion processes. Given enough data to parametrize the model, it is capable of modeling protein-protein interactions over large time- and length-scales, and it can be extended to handle multiple molecules. We derive the MSM/RD framework, and we implement and verify it for two protein-protein benchmark systems and one multiparticle implementation to model the formation of pentameric ring molecules. To enable reproducibility, we have published our code in the MSM/RD software package.